
    Robust Anomaly Detection with Applications to Acoustics and Graphs

    Our goal is to develop a robust anomaly detector that can be incorporated into pattern recognition systems that may need to learn, but will never be shunned for making egregious errors. The ability to know what we do not know is a concept often overlooked when developing classifiers to discriminate between different types of normal data in controlled experiments. We believe that an anomaly detector should be used to produce warnings in real applications when operating conditions change dramatically, especially when other classifiers only have a fixed set of bad candidates from which to choose. Our approach to distributional anomaly detection is to gather local information using features tailored to the domain, aggregate all such evidence to form a global density estimate, and then compare it to a model of normal data. A good match to a recognizable distribution is not required. By design, this process can detect the "unknown unknowns" [1] and properly react to the "black swan events" [2] that can have devastating effects on other systems. We demonstrate that our system is robust to anomalies that may not be well-defined or well-understood even if they have contaminated the training data that is assumed to be non-anomalous. In order to develop a more robust speech activity detector, we reformulate the problem to include acoustic anomaly detection and demonstrate state-of-the-art performance using simple distribution modeling techniques that can be used at incredibly high speed. We begin by demonstrating our approach when training on purely normal conversational speech and then remove all annotation from our training data and demonstrate that our techniques can robustly accommodate anomalous training data contamination. When comparing continuous distributions in higher dimensions, we develop a novel method of discarding portions of a semi-parametric model to form a robust estimate of the Kullback-Leibler divergence. 
Finally, we demonstrate the generality of our approach by using the divergence between distributions of vertex invariants as a graph distance metric and achieve state-of-the-art performance when detecting graph anomalies with neighborhoods of excessive or negligible connectivity. [1] D. Rumsfeld. (2002) Transcript: DoD news briefing - Secretary Rumsfeld and Gen. Myers. [2] N. N. Taleb, The Black Swan: The Impact of the Highly Improbable. Random House, 2007
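The graph-distance idea in the abstract above (a divergence between distributions of vertex invariants) can be illustrated with a minimal sketch. It assumes vertex degree as the invariant and a plain add-one-smoothed histogram as the density estimate; the abstract's semi-parametric model and its robust Kullback-Leibler estimate are not reproduced here, and the function names are hypothetical.

```python
import math
from collections import Counter


def degree_histogram(edges, num_vertices, max_degree):
    """Return a smoothed probability distribution over degrees 0..max_degree.

    Assumption: vertex degree is the invariant; add-one smoothing keeps every
    bin non-zero so the divergence below stays finite.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = [0] * (max_degree + 1)
    for vertex in range(num_vertices):
        counts[min(degree[vertex], max_degree)] += 1
    total = num_vertices + (max_degree + 1)
    return [(c + 1) / total for c in counts]


def kl_divergence(p, q):
    """Discrete Kullback-Leibler divergence D(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Two toy graphs on six vertices: a ring (every degree is 2) and a star
# (one hub of degree 5, five leaves of degree 1).
ring = [(i, (i + 1) % 6) for i in range(6)]
star = [(0, i) for i in range(1, 6)]

p_ring = degree_histogram(ring, num_vertices=6, max_degree=5)
p_star = degree_histogram(star, num_vertices=6, max_degree=5)

print("D(ring || ring) =", kl_divergence(p_ring, p_ring))
print("D(ring || star) =", kl_divergence(p_ring, p_star))
```

A graph whose degree distribution diverges strongly from that of the normal graphs would be flagged as anomalous; the threshold is left open in this sketch.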

    Manual transcription of conversational speech at the articulatory feature level

    Although much is known about how speech is produced, and research into speech production has resulted in measured articulatory data, feature systems of different kinds and numerous models, speech production knowledge is almost totally ignored in current mainstream approaches to automatic speech recognition. Representations of speech production allow simple explanations for many phenomena observed in speech which cannot be easily analyzed from either acoustic signal or phonetic transcription alone. In this article, we provide a survey of a growing body of work in which such representations are used to improve automatic speech recognition

    Articulatory feature-based methods for acoustic and audio-visual speech recognition: Summary from the 2006 JHU Summer Workshop.

    We report on investigations, conducted at the 2006 Johns Hopkins Workshop, into the use of articulatory features (AFs) for observation and pronunciation models in speech recognition. In the area of observation modeling, we use the outputs of AF classifiers both directly, in an extension of hybrid HMM/neural network models, and as part of the observation vector, an extension of the tandem approach. In the area of pronunciation modeling, we investigate a model having multiple streams of AF states with soft synchrony constraints, for both audio-only and audio-visual recognition. The models are implemented as dynamic Bayesian networks, and tested on tasks from the Small-Vocabulary Switchboard (SVitchboard) corpus and the CUAVE audio-visual digits corpus. Finally, we analyze AF classification and forced alignment using a newly collected set of feature-level manual transcriptions

    The European Reference Genome Atlas: piloting a decentralised approach to equitable biodiversity genomics.

    ABSTRACT: A global genome database of all of Earth’s species diversity could be a treasure trove of scientific discoveries. However, regardless of the major advances in genome sequencing technologies, only a tiny fraction of species have genomic information available. To contribute to a more complete planetary genomic database, scientists and institutions across the world have united under the Earth BioGenome Project (EBP), which plans to sequence and assemble high-quality reference genomes for all ∼1.5 million recognized eukaryotic species through a stepwise phased approach. As the initiative transitions into Phase II, where 150,000 species are to be sequenced in just four years, worldwide participation in the project will be fundamental to success. As the European node of the EBP, the European Reference Genome Atlas (ERGA) seeks to implement a new decentralised, accessible, equitable and inclusive model for producing high-quality reference genomes, which will inform EBP as it scales. To embark on this mission, ERGA launched a Pilot Project to establish a network across Europe to develop and test the first infrastructure of its kind for the coordinated and distributed reference genome production on 98 European eukaryotic species from sample providers across 33 European countries. Here we outline the process and challenges faced during the development of a pilot infrastructure for the production of reference genome resources, and explore the effectiveness of this approach in terms of high-quality reference genome production, considering also equity and inclusion. The outcomes and lessons learned during this pilot provide a solid foundation for ERGA while offering key learnings to other transnational and national genomic resource projects.

    Effect of sitagliptin on cardiovascular outcomes in type 2 diabetes

    BACKGROUND: Data are lacking on the long-term effect on cardiovascular events of adding sitagliptin, a dipeptidyl peptidase 4 inhibitor, to usual care in patients with type 2 diabetes and cardiovascular disease. METHODS: In this randomized, double-blind study, we assigned 14,671 patients to add either sitagliptin or placebo to their existing therapy. Open-label use of antihyperglycemic therapy was encouraged as required, aimed at reaching individually appropriate glycemic targets in all patients. To determine whether sitagliptin was noninferior to placebo, we used a relative risk of 1.3 as the marginal upper boundary. The primary cardiovascular outcome was a composite of cardiovascular death, nonfatal myocardial infarction, nonfatal stroke, or hospitalization for unstable angina. RESULTS: During a median follow-up of 3.0 years, there was a small difference in glycated hemoglobin levels (least-squares mean difference for sitagliptin vs. placebo, -0.29 percentage points; 95% confidence interval [CI], -0.32 to -0.27). Overall, the primary outcome occurred in 839 patients in the sitagliptin group (11.4%; 4.06 per 100 person-years) and 851 patients in the placebo group (11.6%; 4.17 per 100 person-years). Sitagliptin was noninferior to placebo for the primary composite cardiovascular outcome (hazard ratio, 0.98; 95% CI, 0.88 to 1.09; P<0.001). Rates of hospitalization for heart failure did not differ between the two groups (hazard ratio, 1.00; 95% CI, 0.83 to 1.20; P = 0.98). There were no significant between-group differences in rates of acute pancreatitis (P = 0.07) or pancreatic cancer (P = 0.32). CONCLUSIONS: Among patients with type 2 diabetes and established cardiovascular disease, adding sitagliptin to usual care did not appear to increase the risk of major adverse cardiovascular events, hospitalization for heart failure, or other adverse events
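The noninferiority reasoning in the abstract above can be sketched as a short check: the upper bound of the 95% confidence interval for the hazard ratio is compared with the prespecified margin of 1.3. The numbers come from the abstract; the class and function names are hypothetical, and applying the same interval logic to the heart-failure result is an illustration only, not a statement of the trial's analysis plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HazardRatio:
    estimate: float
    ci_lower: float
    ci_upper: float


def is_noninferior(hr: HazardRatio, margin: float) -> bool:
    """Noninferiority holds when the whole interval lies below the margin."""
    return hr.ci_upper < margin


def excludes_no_effect(hr: HazardRatio) -> bool:
    """True when the interval does not contain a hazard ratio of 1."""
    return hr.ci_upper < 1.0 or hr.ci_lower > 1.0


MARGIN = 1.3  # marginal upper boundary stated in the abstract

primary_outcome = HazardRatio(estimate=0.98, ci_lower=0.88, ci_upper=1.09)
heart_failure = HazardRatio(estimate=1.00, ci_lower=0.83, ci_upper=1.20)

print("primary outcome noninferior:", is_noninferior(primary_outcome, MARGIN))
print("primary outcome differs from placebo:", excludes_no_effect(primary_outcome))
print("heart failure differs from placebo:", excludes_no_effect(heart_failure))
```

The primary-outcome interval sits below 1.3 but still contains 1, which matches the abstract's wording: noninferior to placebo, with no claim of superiority.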

    Global, regional, and national comparative risk assessment of 79 behavioural, environmental and occupational, and metabolic risks or clusters of risks in 188 countries, 1990-2013: A systematic analysis for the Global Burden of Disease Study 2013

    Background: The Global Burden of Disease, Injuries, and Risk Factor study 2013 (GBD 2013) is the first of a series of annual updates of the GBD. Risk factor quantification, particularly of modifiable risk factors, can help to identify emerging threats to population health and opportunities for prevention. The GBD 2013 provides a timely opportunity to update the comparative risk assessment with new data for exposure, relative risks, and evidence on the appropriate counterfactual risk distribution. Methods: Attributable deaths, years of life lost, years lived with disability, and disability-adjusted life-years (DALYs) have been estimated for 79 risks or clusters of risks using the GBD 2010 methods. Risk-outcome pairs meeting explicit evidence criteria were assessed for 188 countries for the period 1990-2013 by age and sex using three inputs: risk exposure, relative risks, and the theoretical minimum risk exposure level (TMREL). Risks are organised into a hierarchy with blocks of behavioural, environmental and occupational, and metabolic risks at the first level of the hierarchy. The next level in the hierarchy includes nine clusters of related risks and two individual risks, with more detail provided at levels 3 and 4 of the hierarchy. Compared with GBD 2010, six new risk factors have been added: handwashing practices, occupational exposure to trichloroethylene, childhood wasting, childhood stunting, unsafe sex, and low glomerular filtration rate. For most risks, data for exposure were synthesised with a Bayesian metaregression method, DisMod-MR 2.0, or spatial-temporal Gaussian process regression. Relative risks were based on meta-regressions of published cohort and intervention studies. Attributable burden for clusters of risks and all risks combined took into account evidence on the mediation of some risks such as high body-mass index (BMI) through other risks such as high systolic blood pressure and high cholesterol. 
Findings: All risks combined account for 57·2% (95% uncertainty interval [UI] 55·8-58·5) of deaths and 41·6% (40·1-43·0) of DALYs. Risks quantified account for 87·9% (86·5-89·3) of cardiovascular disease DALYs, ranging to a low of 0% for neonatal disorders and neglected tropical diseases and malaria. In terms of global DALYs in 2013, six risks or clusters of risks each caused more than 5% of DALYs: dietary risks accounting for 11·3 million deaths and 241·4 million DALYs, high systolic blood pressure for 10·4 million deaths and 208·1 million DALYs, child and maternal malnutrition for 1·7 million deaths and 176·9 million DALYs, tobacco smoke for 6·1 million deaths and 143·5 million DALYs, air pollution for 5·5 million deaths and 141·5 million DALYs, and high BMI for 4·4 million deaths and 134·0 million DALYs. Risk factor patterns vary across regions and countries and with time. In sub-Saharan Africa, the leading risk factors are child and maternal malnutrition, unsafe sex, and unsafe water, sanitation, and handwashing. In women, in nearly all countries in the Americas, north Africa, and the Middle East, and in many other high-income countries, high BMI is the leading risk factor, with high systolic blood pressure as the leading risk in most of Central and Eastern Europe and south and east Asia. For men, high systolic blood pressure or tobacco use are the leading risks in nearly all high-income countries, in north Africa and the Middle East, Europe, and Asia. For men and women, unsafe sex is the leading risk in a corridor from Kenya to South Africa. Interpretation: Behavioural, environmental and occupational, and metabolic risks can explain half of global mortality and more than one-third of global DALYs providing many opportunities for prevention. Of the larger risks, the attributable burden of high BMI has increased in the past 23 years. 
In view of the prominence of behavioural risk factors, behavioural and social science research on interventions for these risks should be strengthened. Many prevention and primary care policy options are available now to act on key risks
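The three inputs named in the Methods above (risk exposure, relative risks, and the theoretical minimum risk exposure level) combine in the standard population attributable fraction. The sketch below uses the textbook categorical form; all exposure shares, relative risks, and the death count are hypothetical values chosen for illustration, and the study's actual machinery (continuous exposure distributions, mediation between risks, uncertainty draws) is not reproduced.

```python
def population_attributable_fraction(exposure, counterfactual, relative_risk):
    """Categorical PAF: (sum P_i RR_i - sum P*_i RR_i) / sum P_i RR_i.

    exposure       -- observed share of the population in each exposure level
    counterfactual -- share in each level under the minimum-risk scenario
    relative_risk  -- relative risk of the outcome at each level
    """
    observed = sum(p * rr for p, rr in zip(exposure, relative_risk))
    ideal = sum(p * rr for p, rr in zip(counterfactual, relative_risk))
    return (observed - ideal) / observed


# Hypothetical three-level risk factor: none, moderate, high exposure.
exposure = [0.5, 0.3, 0.2]
counterfactual = [1.0, 0.0, 0.0]  # everyone at the minimum-risk level
relative_risk = [1.0, 1.5, 2.0]

paf = population_attributable_fraction(exposure, counterfactual, relative_risk)

# Attributable burden is the PAF applied to the outcome's total burden.
total_deaths = 10_000  # hypothetical
attributable_deaths = paf * total_deaths

print(f"PAF = {paf:.3f}, attributable deaths = {attributable_deaths:.0f}")
```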

    Global, regional, and national comparative risk assessment of 79 behavioural, environmental and occupational, and metabolic risks or clusters of risks, 1990-2015: A systematic analysis for the Global Burden of Disease Study 2015

    Background: The Global Burden of Diseases, Injuries, and Risk Factors Study 2015 provides an up-to-date synthesis of the evidence for risk factor exposure and the attributable burden of disease. By providing national and subnational assessments spanning the past 25 years, this study can inform debates on the importance of addressing risks in context. Methods: We used the comparative risk assessment framework developed for previous iterations of the Global Burden of Disease Study to estimate attributable deaths, disability-adjusted life-years (DALYs), and trends in exposure by age group, sex, year, and geography for 79 behavioural, environmental and occupational, and metabolic risks or clusters of risks from 1990 to 2015. This study included 388 risk-outcome pairs that met World Cancer Research Fund-defined criteria for convincing or probable evidence. We extracted relative risk and exposure estimates from randomised controlled trials, cohorts, pooled cohorts, household surveys, census data, satellite data, and other sources. We used statistical models to pool data, adjust for bias, and incorporate covariates. We developed a metric that allows comparisons of exposure across risk factors—the summary exposure value. Using the counterfactual scenario of theoretical minimum risk level, we estimated the portion of deaths and DALYs that could be attributed to a given risk. We decomposed trends in attributable burden into contributions from population growth, population age structure, risk exposure, and risk-deleted cause-specific DALY rates. We characterised risk exposure in relation to a Socio-demographic Index (SDI). Findings: Between 1990 and 2015, global exposure to unsafe sanitation, household air pollution, childhood underweight, childhood stunting, and smoking each decreased by more than 25%. Global exposure for several occupational risks, high body-mass index (BMI), and drug use increased by more than 25% over the same period. 
All risks jointly evaluated in 2015 accounted for 57·8% (95% CI 56·6–58·8) of global deaths and 41·2% (39·8–42·8) of DALYs. In 2015, the ten largest contributors to global DALYs among Level 3 risks were high systolic blood pressure (211·8 million [192·7 million to 231·1 million] global DALYs), smoking (148·6 million [134·2 million to 163·1 million]), high fasting plasma glucose (143·1 million [125·1 million to 163·5 million]), high BMI (120·1 million [83·8 million to 158·4 million]), childhood undernutrition (113·3 million [103·9 million to 123·4 million]), ambient particulate matter (103·1 million [90·8 million to 115·1 million]), high total cholesterol (88·7 million [74·6 million to 105·7 million]), household air pollution (85·6 million [66·7 million to 106·1 million]), alcohol use (85·0 million [77·2 million to 93·0 million]), and diets high in sodium (83·0 million [49·3 million to 127·5 million]). From 1990 to 2015, attributable DALYs declined for micronutrient deficiencies, childhood undernutrition, unsafe sanitation and water, and household air pollution; reductions in risk-deleted DALY rates rather than reductions in exposure drove these declines. Rising exposure contributed to notable increases in attributable DALYs from high BMI, high fasting plasma glucose, occupational carcinogens, and drug use. Environmental risks and childhood undernutrition declined steadily with SDI; low physical activity, high BMI, and high fasting plasma glucose increased with SDI. In 119 countries, metabolic risks, such as high BMI and fasting plasma glucose, contributed the most attributable DALYs in 2015. Regionally, smoking still ranked among the leading five risk factors for attributable DALYs in 109 countries; childhood underweight and unsafe sex remained primary drivers of early death and disability in much of sub-Saharan Africa. Interpretation: Declines in some key environmental risks have contributed to declines in critical infectious diseases. 
Some risks appear to be invariant to SDI. Increasing risks, including high BMI, high fasting plasma glucose, drug use, and some occupational exposures, contribute to rising burden from some conditions, but also provide opportunities for intervention. Some highly preventable risks, such as smoking, remain major causes of attributable DALYs, even as exposure is declining. Public policy makers need to pay attention to the risks that are increasingly major contributors to global burden. Funding: Bill & Melinda Gates Foundation

    Global, regional, and national age-sex specific all-cause and cause-specific mortality for 240 causes of death, 1990-2013: A systematic analysis for the Global Burden of Disease Study 2013

    Background Up-to-date evidence on levels and trends for age-sex-specific all-cause and cause-specific mortality is essential for the formation of global, regional, and national health policies. In the Global Burden of Disease Study 2013 (GBD 2013) we estimated yearly deaths for 188 countries between 1990 and 2013. We used the results to assess whether there is epidemiological convergence across countries. Methods We estimated age-sex-specific all-cause mortality using the GBD 2010 methods with some refinements to improve accuracy applied to an updated database of vital registration, survey, and census data. We generally estimated cause of death as in the GBD 2010. Key improvements included the addition of more recent vital registration data for 72 countries, an updated verbal autopsy literature review, two new and detailed data systems for China, and more detail for Mexico, UK, Turkey, and Russia. We improved statistical models for garbage code redistribution. We used six different modelling strategies across the 240 causes; cause of death ensemble modelling (CODEm) was the dominant strategy for causes with sufficient information. Trends for Alzheimer's disease and other dementias were informed by meta-regression of prevalence studies. For pathogen-specific causes of diarrhoea and lower respiratory infections we used a counterfactual approach. We computed two measures of convergence (inequality) across countries: the average relative difference across all pairs of countries (Gini coefficient) and the average absolute difference across countries. To summarise broad findings, we used multiple decrement life-tables to decompose probabilities of death from birth to exact age 15 years, from exact age 15 years to exact age 50 years, and from exact age 50 years to exact age 75 years, and life expectancy at birth into major causes. For all quantities reported, we computed 95% uncertainty intervals (UIs). 
We constrained cause-specific fractions within each age-sex-country-year group to sum to all-cause mortality based on draws from the uncertainty distributions. Findings Global life expectancy for both sexes increased from 65·3 years (UI 65·0-65·6) in 1990, to 71·5 years (UI 71·0-71·9) in 2013, while the number of deaths increased from 47·5 million (UI 46·8-48·2) to 54·9 million (UI 53·6-56·3) over the same interval. Global progress masked variation by age and sex: for children, average absolute differences between countries decreased but relative differences increased. For women aged 25-39 years and older than 75 years and for men aged 20-49 years and 65 years and older, both absolute and relative differences increased. Decomposition of global and regional life expectancy showed the prominent role of reductions in age-standardised death rates for cardiovascular diseases and cancers in high-income regions, and reductions in child deaths from diarrhoea, lower respiratory infections, and neonatal causes in low-income regions. HIV/AIDS reduced life expectancy in southern sub-Saharan Africa. For most communicable causes of death both numbers of deaths and age-standardised death rates fell whereas for most non-communicable causes, demographic shifts have increased numbers of deaths but decreased age-standardised death rates. Global deaths from injury increased by 10·7%, from 4·3 million deaths in 1990 to 4·8 million in 2013; but age-standardised rates declined over the same period by 21%. For some causes of more than 100 000 deaths per year in 2013, age-standardised death rates increased between 1990 and 2013, including HIV/AIDS, pancreatic cancer, atrial fibrillation and flutter, drug use disorders, diabetes, chronic kidney disease, and sickle-cell anaemias. Diarrhoeal diseases, lower respiratory infections, neonatal causes, and malaria are still in the top five causes of death in children younger than 5 years. 
The most important pathogens are rotavirus for diarrhoea and pneumococcus for lower respiratory infections. Country-specific probabilities of death over three phases of life were substantially varied between and within regions. Interpretation For most countries, the general pattern of reductions in age-sex specific mortality has been associated with a progressive shift towards a larger share of the remaining deaths caused by non-communicable disease and injuries. Assessing epidemiological convergence across countries depends on whether an absolute or relative measure of inequality is used. Nevertheless, age-standardised death rates for seven substantial causes are increasing, suggesting the potential for reversals in some countries. Important gaps exist in the empirical data for cause of death estimates for some countries; for example, no national data for India are available for the past decade. Funding Bill & Melinda Gates Foundation
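The two convergence measures named in the Methods above, an average absolute difference and a relative one (the Gini coefficient), can be sketched as follows. This uses the textbook pairwise definitions on unweighted values; the country rates are hypothetical, and any population weighting or other refinements in the study are not reproduced.

```python
from itertools import combinations


def mean_absolute_difference(values):
    """Average |a - b| over all unordered pairs of distinct entries."""
    pairs = list(combinations(values, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def gini(values):
    """Gini coefficient: sum of |x_i - x_j| over all ordered pairs,
    divided by 2 * n^2 * mean. Scale-invariant, so it is a relative measure."""
    n = len(values)
    mean = sum(values) / n
    total = sum(abs(a - b) for a in values for b in values)
    return total / (2 * n * n * mean)


# Hypothetical death rates for three countries, then the same rates halved.
rates_before = [10.0, 20.0, 40.0]
rates_after = [r / 2 for r in rates_before]

print("absolute:", mean_absolute_difference(rates_before),
      "->", mean_absolute_difference(rates_after))
print("relative:", gini(rates_before), "->", gini(rates_after))
```

When every rate is halved, the absolute gap between countries halves while the Gini coefficient is unchanged, which is one way to see why the abstract's conclusion about convergence depends on the measure chosen.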